Jacobian Regularization for Mitigating Universal Adversarial Perturbations
Authors
Abstract
Universal Adversarial Perturbations (UAPs) are input perturbations that can fool a neural network on large sets of data. This class of attacks represents a significant threat, as such perturbations facilitate realistic, practical, and low-cost attacks on neural networks. In this work, we derive upper bounds for the effectiveness of UAPs based on norms of data-dependent Jacobians. We empirically verify that Jacobian regularization greatly increases model robustness to UAPs, by up to four times, whilst maintaining clean performance. Our theoretical analysis also allows us to formulate a metric for the strength of shared adversarial perturbations between pairs of inputs. We apply this metric to benchmark datasets and show that it is highly correlated with the actual observed robustness. This suggests that realistic and practical universal attacks can be reliably mitigated without sacrificing clean accuracy, which shows promise for the robustness of machine learning systems.
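The regularizer discussed above penalizes the norm of the model's input-output Jacobian during training. The following is a minimal sketch of that idea on a toy linear model, using a finite-difference Jacobian; the names (`W`, `jacobian_fd`, `lambda_jr`) and the plain squared-Frobenius-norm penalty are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Toy "network": a linear map f(x) = W x with 5 inputs and 3 outputs.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
x = rng.standard_normal(5)

def jacobian_fd(f, x, eps=1e-6):
    """Forward finite-difference Jacobian of f at x (one column per input dim)."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps
        J[:, i] = (f(xp) - y0) / eps
    return J

f = lambda v: W @ v
J = jacobian_fd(f, x)

# Jacobian-regularized objective: task loss plus a penalty on ||J||_F^2.
# lambda_jr is a hypothetical regularization weight chosen for illustration.
lambda_jr = 0.1
task_loss = 0.5 * np.sum(f(x) ** 2)   # stand-in for the real task loss
jr_penalty = np.sum(J ** 2)           # squared Frobenius norm of the Jacobian
total_loss = task_loss + lambda_jr * jr_penalty

# Sanity check: for a linear map the finite-difference Jacobian recovers W.
assert np.allclose(J, W, atol=1e-4)
```

In a real training loop the penalty would be computed with automatic differentiation rather than finite differences, and minimizing it shrinks the Jacobian norms that the paper's upper bounds on UAP effectiveness depend on.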
Similar Resources
Towards Mitigating Audio Adversarial Perturbations
Audio adversarial examples targeting automatic speech recognition systems have recently been made possible in different tasks, such as speech-to-text translation and speech classification. Here we aim to explore the robustness of these audio adversarial examples generated via two attack strategies by applying different signal processing methods to recover the original audio sequence. In additio...
Defense against Universal Adversarial Perturbations
Recent advances in Deep Learning show the existence of image-agnostic quasi-imperceptible perturbations that when applied to ‘any’ image can fool a state-of-the-art network classifier to change its prediction about the image label. These ‘Universal Adversarial Perturbations’ pose a serious threat to the success of Deep Learning in practice. We present the first dedicated framework to effectivel...
Analysis of universal adversarial perturbations
Deep networks have recently been shown to be vulnerable to universal perturbations: there exist very small image-agnostic perturbations that cause most natural images to be misclassified by such classifiers. In this paper, we propose a quantitative analysis of the robustness of classifiers to universal perturbations, and draw a formal link between the robustness to universal perturbations, and ...
Learning Universal Adversarial Perturbations with Generative Models
Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally perturbed to remain visually similar to the source input, but cause a misclassification. It was recently shown that given a dataset and classifier, there exists so called universal adversarial perturbations, a single perturbation that causes a misclassification when applied to any input. In t...
Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization
Deep neural networks have lately shown tremendous performance in various applications including vision and speech processing tasks. However, alongside their ability to perform these tasks with such high accuracy, it has been shown that they are highly susceptible to adversarial attacks: a small change of the input would cause the network to err with high confidence. This phenomenon exposes an i...
Journal
Journal Title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-86380-7_17